Overview

Dataset statistics

Number of variables: 13
Number of observations: 528
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 53.8 KiB
Average record size in memory: 104.2 B

Variable types

Numeric (NUM): 8
Categorical (CAT): 5

Warnings

animal_category has constant value "Domestic" (528 rows) Constant
species has constant value "Birds" (528 rows) Constant
killed_and_disposed_of_thsdhd is highly skewed (γ1 = 20.04431811) Skewed
new_outbreaks is highly skewed (γ1 = 21.95958064) Skewed
cases_thsdhd has 91 (17.2%) zeros Zeros
deaths_thsdhd has 212 (40.2%) zeros Zeros
killed_and_disposed_of_thsdhd has 214 (40.5%) zeros Zeros
new_outbreaks has 502 (95.1%) zeros Zeros
slaughtered_thsdhd has 416 (78.8%) zeros Zeros
susceptible_thsdhd has 115 (21.8%) zeros Zeros
vaccinated_thsdhd has 434 (82.2%) zeros Zeros
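
The Zeros and Skewed warnings above can be approximated by hand. The sketch below is an illustration only: `zero_share` and `sample_skewness` are hypothetical helper names, the input list is toy data rather than the profiled dataset, and the bias-adjusted sample skewness is an assumption about how γ1 is computed in the report.

```python
import math


def zero_share(values):
    """Return (count, fraction) of entries that are exactly zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, zeros / len(values)


def sample_skewness(values):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # a constant column has no skew
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Toy column: mostly zeros with one large value, like the flagged variables.
toy = [0, 0, 0, 0, 0, 0, 0, 0, 1, 40]
zeros, share = zero_share(toy)
print(zeros, share)
print(sample_skewness(toy))
```

A column dominated by zeros with a few very large values produces both a high zero share and a large positive skewness, which is the pattern flagged for new_outbreaks and killed_and_disposed_of_thsdhd.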

Reproduction

Analysis started: 2022-04-08 16:50:54.624686
Analysis finished: 2022-04-08 16:51:07.940368
Duration: 13.32 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

year
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2015.25
Minimum: 2011
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2011
Q1: 2013
Median: 2015
Q3: 2017
95-th percentile: 2020
Maximum: 2021
Range: 10
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.729069042
Coefficient of variation (CV): 0.00135420868
Kurtosis: -0.8381832564
Mean: 2015.25
Median Absolute Deviation (MAD): 2
Skewness: 0.1788019969
Sum: 1064052
Variance: 7.447817837
Monotonicity: Increasing
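
The statistics for year can be cross-checked against the value counts listed for this variable. This is a minimal sketch using only the Python standard library: the counts are copied from the report's frequency tables, while the sample variance (n - 1 denominator) and the "inclusive" quantile method are assumptions about how the report computes its figures.

```python
import math
import statistics

# Value counts for `year`, copied from the frequency tables in this report.
year_counts = {
    2011: 53, 2012: 51, 2013: 54, 2014: 59, 2015: 66, 2016: 65,
    2017: 68, 2018: 42, 2019: 27, 2020: 29, 2021: 14,
}

# Expand the counts back into one value per observation.
years = [y for y, c in sorted(year_counts.items()) for _ in range(c)]

total = sum(years)
mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                        # coefficient of variation
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
iqr = q3 - q1

print(len(years), total, mean)
print(variance, std, cv)
print(q1, median, q3, iqr)
```

Under these assumptions the recomputed sum, mean, variance, quartiles and IQR agree with the values listed above.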
Histogram with fixed size bins (bins=11)
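
Common values
Value | Count | Frequency (%)
2017 | 68 | 12.9%
2015 | 66 | 12.5%
2016 | 65 | 12.3%
2014 | 59 | 11.2%
2013 | 54 | 10.2%
2011 | 53 | 10.0%
2012 | 51 | 9.7%
2018 | 42 | 8.0%
2020 | 29 | 5.5%
2019 | 27 | 5.1%

Minimum 5 values
Value | Count | Frequency (%)
2011 | 53 | 10.0%
2012 | 51 | 9.7%
2013 | 54 | 10.2%
2014 | 59 | 11.2%
2015 | 66 | 12.5%

Maximum 5 values
Value | Count | Frequency (%)
2021 | 14 | 2.7%
2020 | 29 | 5.5%
2019 | 27 | 5.1%
2018 | 42 | 8.0%
2017 | 68 | 12.9%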

country
Categorical

Distinct: 11
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

China (People's Rep. of): 109
Germany: 56
Italy: 51
India: 46
Poland: 42
Other values (6): 224

Value | Count | Frequency (%)
China (People's Rep. of) | 109 | 20.6%
Germany | 56 | 10.6%
Italy | 51 | 9.7%
India | 46 | 8.7%
Poland | 42 | 8.0%
Spain | 39 | 7.4%
United Kingdom | 38 | 7.2%
Brazil | 38 | 7.2%
France | 38 | 7.2%
Netherlands | 38 | 7.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 24
Median length: 7
Mean length: 11.625
Min length: 5

disease
Categorical

Distinct: 20
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Low pathogenic avian influenza (poultry) (2006-2021): 124
High pathogenicity avian influenza viruses (poultry) (Inf. with): 110
Infectious bursal disease (Gumboro disease): 44
Mycoplasma gallisepticum (Avian mycoplasmosis) (Inf. with): 34
Avian infectious laryngotracheitis: 32
Other values (15): 184

Value | Count | Frequency (%)
Low pathogenic avian influenza (poultry) (2006-2021) | 124 | 23.5%
High pathogenicity avian influenza viruses (poultry) (Inf. with) | 110 | 20.8%
Infectious bursal disease (Gumboro disease) | 44 | 8.3%
Mycoplasma gallisepticum (Avian mycoplasmosis) (Inf. with) | 34 | 6.4%
Avian infectious laryngotracheitis | 32 | 6.1%
Avian chlamydiosis | 31 | 5.9%
Fowl typhoid | 29 | 5.5%
Avian infectious bronchitis | 27 | 5.1%
Avian mycoplasmosis (M.synoviae) (2006-) | 26 | 4.9%
Newcastle disease virus (Inf. with) | 22 | 4.2%
Other values (10) | 49 | 9.3%

Frequencies of value counts

Unique

Unique: 3
Unique (%): 0.6%

Histogram of lengths of the category

Length

Max length: 96
Median length: 52
Mean length: 43.93181818
Min length: 12

serotype_subtype_genotype
Categorical

Distinct: 45
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

-: 292
H5N8: 41
H5N1: 37
H5N2: 31
H7N7: 14
Other values (40): 113

Value | Count | Frequency (%)
- | 292 | 55.3%
H5N8 | 41 | 7.8%
H5N1 | 37 | 7.0%
H5N2 | 31 | 5.9%
H7N7 | 14 | 2.7%
H5N3 | 13 | 2.5%
H5N6 | 11 | 2.1%
H5 | 11 | 2.1%
H7N9 | 10 | 1.9%
H7N3 | 8 | 1.5%
Other values (35) | 60 | 11.4%

Frequencies of value counts

Unique

Unique: 22
Unique (%): 4.2%

Histogram of lengths of the category

Length

Max length: 14
Median length: 1
Mean length: 2.695075758
Min length: 1

animal_category
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Domestic: 528

Value | Count | Frequency (%)
Domestic | 528 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

species
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Birds: 528

Value | Count | Frequency (%)
Birds | 528 | 100.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

cases_thsdhd
Real number (ℝ≥0)

ZEROS

Distinct: 368
Distinct (%): 69.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 118.7286534
Minimum: 0
Maximum: 6755.467
Zeros: 91
Zeros (%): 17.2%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.005
Median: 0.9005
Q3: 38.1195
95-th percentile: 513.8006
Maximum: 6755.467
Range: 6755.467
Interquartile range (IQR): 38.1145

Descriptive statistics

Standard deviation: 488.087932
Coefficient of variation (CV): 4.110953152
Kurtosis: 105.7498766
Mean: 118.7286534
Median Absolute Deviation (MAD): 0.9005
Skewness: 9.309903519
Sum: 62688.729
Variance: 238229.8293
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 91 | 17.2%
0.001 | 15 | 2.8%
0.002 | 9 | 1.7%
0.004 | 8 | 1.5%
0.005 | 6 | 1.1%
0.012 | 6 | 1.1%
0.02 | 5 | 0.9%
0.01 | 5 | 0.9%
0.009 | 4 | 0.8%
0.003 | 4 | 0.8%
Other values (358) | 375 | 71.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 91 | 17.2%
0.001 | 15 | 2.8%
0.002 | 9 | 1.7%
0.003 | 4 | 0.8%
0.004 | 8 | 1.5%

Maximum 5 values
Value | Count | Frequency (%)
6755.467 | 1 | 0.2%
5747.003 | 1 | 0.2%
3435.738 | 1 | 0.2%
3328.083 | 1 | 0.2%
2379.567 | 1 | 0.2%

deaths_thsdhd
Real number (ℝ≥0)

ZEROS

Distinct: 283
Distinct (%): 53.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.396933712
Minimum: 0
Maximum: 132.684
Zeros: 212
Zeros (%): 40.2%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.048
Q3: 5.6935
95-th percentile: 44.6209
Maximum: 132.684
Range: 132.684
Interquartile range (IQR): 5.6935

Descriptive statistics

Standard deviation: 20.2450065
Coefficient of variation (CV): 2.410999919
Kurtosis: 15.35531113
Mean: 8.396933712
Median Absolute Deviation (MAD): 0.048
Skewness: 3.691911105
Sum: 4433.581
Variance: 409.8602881
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 212 | 40.2%
0.001 | 13 | 2.5%
0.005 | 3 | 0.6%
0.015 | 3 | 0.6%
0.024 | 3 | 0.6%
0.05 | 3 | 0.6%
0.122 | 2 | 0.4%
0.003 | 2 | 0.4%
0.099 | 2 | 0.4%
0.002 | 2 | 0.4%
Other values (273) | 283 | 53.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 212 | 40.2%
0.001 | 13 | 2.5%
0.002 | 2 | 0.4%
0.003 | 2 | 0.4%
0.004 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
132.684 | 1 | 0.2%
130.367 | 1 | 0.2%
129.029 | 1 | 0.2%
125.757 | 1 | 0.2%
119.843 | 1 | 0.2%

killed_and_disposed_of_thsdhd
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 301
Distinct (%): 57.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.56050189
Minimum: 0
Maximum: 15749.788
Zeros: 214
Zeros (%): 40.5%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.2155
Q3: 20.9615
95-th percentile: 249.2598
Maximum: 15749.788
Range: 15749.788
Interquartile range (IQR): 20.9615

Descriptive statistics

Standard deviation: 717.426243
Coefficient of variation (CV): 8.484176736
Kurtosis: 433.9482518
Mean: 84.56050189
Median Absolute Deviation (MAD): 0.2155
Skewness: 20.04431811
Sum: 44647.945
Variance: 514700.4141
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 214 | 40.5%
0.001 | 5 | 0.9%
0.014 | 3 | 0.6%
0.002 | 3 | 0.6%
0.005 | 2 | 0.4%
0.032 | 2 | 0.4%
30.003 | 2 | 0.4%
0.046 | 2 | 0.4%
0.102 | 2 | 0.4%
5.267 | 2 | 0.4%
Other values (291) | 291 | 55.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 214 | 40.5%
0.001 | 5 | 0.9%
0.002 | 3 | 0.6%
0.003 | 1 | 0.2%
0.004 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
15749.788 | 1 | 0.2%
2917.495 | 1 | 0.2%
2104.285 | 1 | 0.2%
1623.136 | 1 | 0.2%
1431.452 | 1 | 0.2%

new_outbreaks
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 11
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8806818182
Minimum: 0
Maximum: 320
Zeros: 502
Zeros (%): 95.1%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 320
Range: 320
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 14.14384512
Coefficient of variation (CV): 16.06010801
Kurtosis: 494.597061
Mean: 0.8806818182
Median Absolute Deviation (MAD): 0
Skewness: 21.95958064
Sum: 465
Variance: 200.0483548
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Common values
Value | Count | Frequency (%)
0 | 502 | 95.1%
1 | 11 | 2.1%
4 | 5 | 0.9%
2 | 3 | 0.6%
5 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
12 | 1 | 0.2%
29 | 1 | 0.2%
47 | 1 | 0.2%

Minimum 5 values
Value | Count | Frequency (%)
0 | 502 | 95.1%
1 | 11 | 2.1%
2 | 3 | 0.6%
4 | 5 | 0.9%
5 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
320 | 1 | 0.2%
47 | 1 | 0.2%
29 | 1 | 0.2%
12 | 1 | 0.2%
8 | 1 | 0.2%

slaughtered_thsdhd
Real number (ℝ≥0)

ZEROS

Distinct: 112
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.64864773
Minimum: 0
Maximum: 1361.17
Zeros: 416
Zeros (%): 78.8%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 35.75175
Maximum: 1361.17
Range: 1361.17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 89.72647
Coefficient of variation (CV): 6.574019038
Kurtosis: 131.490435
Mean: 13.64864773
Median Absolute Deviation (MAD): 0
Skewness: 10.5788297
Sum: 7206.486
Variance: 8050.839418
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 416 | 78.8%
0.003 | 2 | 0.4%
0.031 | 1 | 0.2%
0.047 | 1 | 0.2%
0.029 | 1 | 0.2%
0.029 | 1 | 0.2%
1.107 | 1 | 0.2%
116 | 1 | 0.2%
0.082 | 1 | 0.2%
39.138 | 1 | 0.2%
Other values (102) | 102 | 19.3%

Minimum 5 values
Value | Count | Frequency (%)
0 | 416 | 78.8%
0.003 | 2 | 0.4%
0.004 | 1 | 0.2%
0.005 | 1 | 0.2%
0.009 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
1361.17 | 1 | 0.2%
993.561 | 1 | 0.2%
662.399 | 1 | 0.2%
481.567 | 1 | 0.2%
478.606 | 1 | 0.2%

susceptible_thsdhd
Real number (ℝ≥0)

ZEROS

Distinct: 403
Distinct (%): 76.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 696.7772102
Minimum: 0
Maximum: 73346.311
Zeros: 115
Zeros (%): 21.8%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.1215
Median: 20.6495
Q3: 210.149
95-th percentile: 2908.04115
Maximum: 73346.311
Range: 73346.311
Interquartile range (IQR): 210.0275

Descriptive statistics

Standard deviation: 3706.089959
Coefficient of variation (CV): 5.318902375
Kurtosis: 283.6831165
Mean: 696.7772102
Median Absolute Deviation (MAD): 20.6495
Skewness: 15.18517536
Sum: 367898.367
Variance: 13735102.78
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 115 | 21.8%
0.001 | 3 | 0.6%
0.004 | 2 | 0.4%
3.018 | 2 | 0.4%
12 | 2 | 0.4%
5.267 | 2 | 0.4%
31.985 | 2 | 0.4%
3 | 2 | 0.4%
0.13 | 2 | 0.4%
0.5 | 2 | 0.4%
Other values (393) | 394 | 74.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 115 | 21.8%
0.001 | 3 | 0.6%
0.004 | 2 | 0.4%
0.006 | 1 | 0.2%
0.011 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
73346.311 | 1 | 0.2%
17750.013 | 1 | 0.2%
16628.557 | 1 | 0.2%
15476.535 | 1 | 0.2%
15309.175 | 1 | 0.2%

vaccinated_thsdhd
Real number (ℝ≥0)

ZEROS

Distinct: 95
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.410875
Minimum: 0
Maximum: 3423.02
Zeros: 434
Zeros (%): 82.2%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 473.61095
Maximum: 3423.02
Range: 3423.02
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 298.8545384
Coefficient of variation (CV): 4.12720518
Kurtosis: 53.97638818
Mean: 72.410875
Median Absolute Deviation (MAD): 0
Skewness: 6.532894762
Sum: 38232.942
Variance: 89314.03511
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 434 | 82.2%
983.361 | 1 | 0.2%
40.099 | 1 | 0.2%
877.358 | 1 | 0.2%
11.846 | 1 | 0.2%
314.766 | 1 | 0.2%
113.7 | 1 | 0.2%
1021.947 | 1 | 0.2%
8.467 | 1 | 0.2%
433.84 | 1 | 0.2%
Other values (85) | 85 | 16.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 434 | 82.2%
0.015 | 1 | 0.2%
0.2 | 1 | 0.2%
0.202 | 1 | 0.2%
0.465 | 1 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
3423.02 | 1 | 0.2%
2936.485 | 1 | 0.2%
2149.913 | 1 | 0.2%
1552.688 | 1 | 0.2%
1532.873 | 1 | 0.2%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 no linear correlation, and +1 total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
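
The calculation just described can be sketched in a few lines of plain Python. `pearson_r` is a hypothetical helper written for illustration; it is not taken from the report's own code.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations; the common 1/(n - 1)
    # factor of the covariance and the variances cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(xs, [2 * x + 5 for x in xs]))    # increasing line
print(pearson_r(xs, [100 * x - 7 for x in xs]))  # steeper line, same r
print(pearson_r(xs, [-3 * x for x in xs]))       # decreasing line
```

The first two calls illustrate the invariance under location and scale: both lines give r close to +1 despite very different slopes.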

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 no monotonic correlation, and +1 total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
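
As a rough illustration of this recipe, the sketch below ranks both variables and then applies the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their rank positions is an assumption (a common convention), not something stated in the report.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]  # nonlinear but strictly increasing
print(pearson_r(xs, cubes))
print(spearman_rho(xs, cubes))
```

For the cubic example Pearson's r stays below 1 because the relationship is not linear, while ρ reaches 1 because the ordering is perfectly preserved.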

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 no correlation, and +1 total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
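
The pair-counting definition can be written out directly. The sketch below implements exactly the formula in the text (often called τ-a); `kendall_tau_a` is a hypothetical name, and the report may well use a tie-adjusted variant such as τ-b, so treat this as an illustration of the idea rather than of the report's computation.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y are neither, but still count toward the total.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # same order
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))  # reversed order
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))  # one swapped pair
```

The quadratic loop over all pairs is fine for a few hundred rows such as this dataset; larger inputs usually call for an O(n log n) algorithm.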

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
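
A minimal sketch of a bias-corrected Cramér's V follows, using the commonly cited form of Bergsma's correction. `cramers_v_corrected` is a hypothetical helper, and whether the report's implementation matches it term for term is an assumption.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two rows")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions accordingly.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(phi2_corr / denom)


same = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
indep = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
const = cramers_v_corrected(["a", "b"] * 50, ["x"] * 100)
print(same, indep, const)
```

The last call mirrors constant columns such as animal_category and species in this dataset: with a single category no association can be measured, so the sketch returns 0.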

Missing values


Sample

First rows

index | year | country | disease | serotype_subtype_genotype | animal_category | species | cases_thsdhd | deaths_thsdhd | killed_and_disposed_of_thsdhd | new_outbreaks | slaughtered_thsdhd | susceptible_thsdhd | vaccinated_thsdhd
0 | 2011 | Brazil | Avian infectious bronchitis | - | Domestic | Birds | 556.254 | 38.576 | 1.530 | 0 | 0.023 | 1335.599 | 0.000
1 | 2011 | Brazil | Avian mycoplasmosis (M.synoviae) (2006-) | - | Domestic | Birds | 2379.567 | 1.891 | 1.940 | 0 | 0.840 | 3071.680 | 0.000
2 | 2011 | Brazil | Fowl cholera (-2011) | - | Domestic | Birds | 12.263 | 4.577 | 0.215 | 0 | 0.470 | 249.082 | 0.000
3 | 2011 | Brazil | Infectious bursal disease (Gumboro disease) | - | Domestic | Birds | 513.304 | 11.917 | 0.000 | 0 | 0.000 | 593.124 | 0.000
4 | 2011 | Brazil | Mycoplasma gallisepticum (Avian mycoplasmosis) (Inf. with) | - | Domestic | Birds | 214.844 | 3.448 | 0.001 | 0 | 0.009 | 262.024 | 0.000
5 | 2011 | China (People's Rep. of) | Avian infectious bronchitis | - | Domestic | Birds | 675.617 | 61.292 | 30.884 | 0 | 5.011 | 6144.577 | 2936.485
6 | 2011 | China (People's Rep. of) | Avian infectious laryngotracheitis | - | Domestic | Birds | 297.014 | 21.036 | 6.128 | 0 | 0.951 | 2944.032 | 1259.010
7 | 2011 | China (People's Rep. of) | Duck virus hepatitis | - | Domestic | Birds | 150.113 | 47.873 | 20.781 | 0 | 7.426 | 912.188 | 324.633
8 | 2011 | China (People's Rep. of) | Fowl cholera (-2011) | - | Domestic | Birds | 350.782 | 72.470 | 43.432 | 0 | 3.624 | 6265.897 | 1552.688
9 | 2011 | China (People's Rep. of) | Fowl typhoid | - | Domestic | Birds | 3.308 | 0.953 | 0.080 | 0 | 0.045 | 65.342 | 3.826

Last rows

index | year | country | disease | serotype_subtype_genotype | animal_category | species | cases_thsdhd | deaths_thsdhd | killed_and_disposed_of_thsdhd | new_outbreaks | slaughtered_thsdhd | susceptible_thsdhd | vaccinated_thsdhd
518 | 2021 | India | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N8 | Domestic | Birds | 23.120 | 16.599 | 58.107 | 4 | 0.0 | 170.288 | 0.0
519 | 2021 | Italy | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N1 | Domestic | Birds | 0.010 | 0.000 | 0.000 | 1 | 0.0 | 8.960 | 0.0
520 | 2021 | Italy | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N8 | Domestic | Birds | 0.078 | 0.059 | 0.037 | 0 | 0.0 | 0.096 | 0.0
521 | 2021 | Italy | Low pathogenic avian influenza (poultry) (2006-2021) | H5N7 | Domestic | Birds | 0.000 | 0.000 | 0.000 | 0 | 0.0 | 3.018 | 0.0
522 | 2021 | Italy | Low pathogenic avian influenza (poultry) (2006-2021) | H7N7 | Domestic | Birds | 0.034 | 0.033 | 3.068 | 0 | 0.0 | 3.018 | 0.0
523 | 2021 | Netherlands | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N8 | Domestic | Birds | 0.600 | 0.600 | 64.575 | 2 | 0.0 | 65.175 | 0.0
524 | 2021 | Poland | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N8 | Domestic | Birds | 19.335 | 3.438 | 24.267 | 2 | 0.0 | 27.705 | 0.0
525 | 2021 | United Kingdom | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N1 | Domestic | Birds | 0.103 | 0.099 | 3.105 | 1 | 0.0 | 3.204 | 0.0
526 | 2021 | United Kingdom | High pathogenicity avian influenza viruses (poultry) (Inf. with) | H5N8 | Domestic | Birds | 0.000 | 0.000 | 0.000 | 1 | 0.0 | 0.001 | 0.0
527 | 2021 | United Kingdom | Influenza A viruses of high pathogenicity (Inf. with) (non-poultry including wild birds) (2017-) | H5N1 | Domestic | Birds | 0.012 | 0.011 | 0.001 | 1 | 0.0 | 0.012 | 0.0